Part-of-speech Tagging for Grammar Checking of Punjabi

The Linguistics Journal, Volume 4, Issue 1

Authors

  • John Adamson
  • Joseph Jung
Abstract

Part-of-speech (POS) tagging is one of the major activities performed in a typical natural language processing application. This paper explores part-of-speech tagging for the Punjabi language, a member of the Modern Indo-Aryan family of languages. A tagset for use in grammar checking and other similar applications is proposed. This fine-grained tagset is based entirely on the grammatical categories involved in various types of concord in typical Punjabi sentences. The morpho-syntactic features included in this tagset are largely based on the inflectional morphology of Punjabi words. The motivation behind devising this tagset, with focus on agreement features of these languages, is that there is no tagset available for Punjabi or other Indian languages. The tagsets for other languages do not cover all the grammatical features that are required for agreement checking in Punjabi texts. A rule-based tagger derived from this tagset is also described. This will be the first published POS tagger for Punjabi. The tagset described in this paper is recommended for grammar checking and other similar applications for the languages sharing grammatical features with Punjabi, more specifically the languages of the Modern Indo-Aryan family.
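
To illustrate why a grammar checker benefits from tags that carry agreement features, here is a minimal sketch. It is not the tagset or tagger from the paper: the tag fields (category, gender, number, case), the toy lexicon with simplified transliterations, and the function names `tag_words` and `agreement_errors` are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Tag:
    """Hypothetical fine-grained tag that keeps agreement features."""
    category: str  # "ADJ" or "NOUN"
    gender: str    # "M" or "F"
    number: str    # "SG" or "PL"
    case: str      # "DIR" (direct) or "OBL" (oblique)


# Toy full-form lexicon (illustrative only; real forms can be ambiguous).
LEXICON: Dict[str, Tag] = {
    "kala": Tag("ADJ", "M", "SG", "DIR"),    # 'black', masculine singular
    "kali": Tag("ADJ", "F", "SG", "DIR"),    # 'black', feminine singular
    "kale": Tag("ADJ", "M", "PL", "DIR"),    # 'black', masculine plural
    "munda": Tag("NOUN", "M", "SG", "DIR"),  # 'boy'
    "munde": Tag("NOUN", "M", "PL", "DIR"),  # 'boys'
    "kuri": Tag("NOUN", "F", "SG", "DIR"),   # 'girl'
}


def tag_words(words: List[str]) -> List[Tuple[str, Tag]]:
    """Look each word up in the toy lexicon (no disambiguation rules)."""
    return [(word, LEXICON[word]) for word in words]


def agreement_errors(tagged: List[Tuple[str, Tag]]) -> List[Tuple[str, str, str]]:
    """Report (modifier, noun, feature) for each adjacent ADJ-NOUN mismatch."""
    errors = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if t1.category == "ADJ" and t2.category == "NOUN":
            for feature in ("gender", "number", "case"):
                if getattr(t1, feature) != getattr(t2, feature):
                    errors.append((w1, w2, feature))
    return errors


if __name__ == "__main__":
    for phrase in (["kala", "munda"], ["kali", "munda"], ["kale", "munda"]):
        print(" ".join(phrase), agreement_errors(tag_words(phrase)))
```

A coarse tagset that labels these words only as adjective or noun could not support this check, which is the gap the abstract describes for agreement checking.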

Similar resources

A Grammar Checking System for Punjabi

This article describes the grammar checking system developed for detecting various grammatical errors in Punjabi texts. The system utilizes a full-form lexicon for morphological analysis, and applies rule-based approaches for part-of-speech tagging and phrase chunking. The system follows a novel approach of performing agreement checks at phrase and clause levels using the gramm...

A Punjabi Grammar Checker

This article describes the grammar checking software developed for detecting grammatical errors in Punjabi texts and providing suggestions, wherever appropriate, to rectify those errors. The system utilizes a full-form lexicon for morphological analysis and rule-based systems for part-of-speech tagging and phrase chunking. The system supported by a set of carefully devised erro...

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a treebank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of treebank data has been important ever since the first large-scale treebank, the Penn Treebank, was published. Although treebanks originated in computational linguistics, their value is becoming more widely appreciated in linguistics research as a whole. F...

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

Part-of-speech tagging of Persian using a fuzzy network model

Part-of-speech tagging (POS tagging) is an active research area in natural language processing (NLP) applications. The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is determining the grammatical ...

Journal: The Linguistics Journal

Volume 4, Issue 1

Publication date: 2009